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ABSTRACT 

Cosmological inference becomes increasingly difficult when complex data-generating 
processes cannot be modeled by simple probability distributions. With the ever- 
increasing size of data sets in cosmology, there is increasing burden placed on adequate 
modeling; systematic errors in the model will dominate where previously these were 
swamped by statistical errors. For example, Gaussian distributions are an insufficient 
representation for errors in quantities like photometric redshifts. Likewise, it can be 
difficult to quantify analytically the distribution of errors that are introduced in com- 
plex fitting codes. Without a simple form for these distributions, it becomes difficult 
to accurately construct a likelihood function for the data as a function of parameters 
of interest. Approximate Bayesian computation (ABC) provides a means of probing 
the posterior distribution when direct calculation of a sufficiently accurate likelihood is 
intractable. ABC allows one to bypass direct calculation of the likelihood but instead 
relies upon the ability to simulate the forward process that generated the data. These 
simulations can naturally incorporate priors placed on nuisance parameters, and hence 
these can be marginalized in a natural way. We present and discuss ABC methods in 
the context of supernova cosmology using data from the SDSS-II Supernova Survey. 
Assuming a flat cosmology and constant dark energy equation of state we demonstrate 
that ABC can recover an accurate posterior distribution. Finally we show that ABC 
can still produce an accurate posterior distribution when we contaminate the sample 
with Type IIP supernovae. 

Subject headings: supernova,cosmology,techniques 



1. Introduction 



Since the discovery of the accelerated expansion of our universe (jRiess et al.lll998l ; iPerlmutter et al 

1999l ). the quality of Type la supernova (SN la) data sets has improved and the quantity has 
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grown to thousands through individual effor ts with the Hubble Space Telescope (jKnop et al 



2003 



Riess et al 



2004 



Amanullah et alJ boiol) and surveys such as the Sup ernova Legacy Survey 



dAstier et al.l bood : IConlev et alJ boilh. the ESSEN CE Supernova Surve y dlVliknaitis et al.l bo07l: 



Wood-Vasev et al.ll2007r). the CfA Supe rnova group (IHicken et alJ 12009. 120121). the Carnegie Su 



pernova Project (IContreras et al.l |201Q| ; IStritzinger et al 



2011 



the Sloan Digital Sky Survey-I I 



(jLampeitl et al.ll2010l ). and the Lick Observatory Supernova Search (jGaneshalingam et al.1 120 10), 



Addi tional current and near- future surveys such as the Palomar Transient Factory 1 ( Law et al 



2009), the Panoramic Survey Telescope and Rapid Response System (Pan-STARRSjj, SkyMap- 
peijj, and the Dark Energy Surve50 will increase the sample by another order of magnitude with the 
goal of obtaining tighter constraints on the nature of dark energy. The Large Synoptic Survey Tele- 
scope (LS ST) anticipates observing hundreds of t housands of well-measured Type la supernovae 
(SNe la) (|LSST Science Collaborations et al.l bood ) . 

In this new regime of large numbers of SNe la the weaknesses and limitations of our current x 2 
likelihood approach to estimating cosmological parameters are becoming apparent. For example, 
with limited spectroscopic follow-up, we must rely on light-curve classification codes and photomet- 
ric redshift tools to maximize the scientific potential of SN la cosmology with LSST and near-future 
surveys. These two crucial steps alone introduce a nontrivial component to our probability models 
from which we construct the likelihood. Additionally, there are significant systematic uncertainties 
including errors from calibration, survey design and cadence, host galaxy subtraction and intrinsic 
dust, population evolution, gravitational lensing, and peculiar velocities. All of these uncertainties 
contribute to a probability model which simply cannot be accurately described by a multivariate 
normal distribution. 

In this paper we describe how the statistical technique of Approximate Bayesian Computation 
(ABC) can be used to overcome these challenges and explore the space of cosmological parameters 
in the face of non-Gaussian distributions of systematic uncertainties, co mplicated functional priors , 
and large data sets. We encourage the reader to read the recent paper bv lCameron h Pettittl (|2012l ) 
for an introduction to and application of ABC in the context of galaxy evolution. We here focus 
on supernova cosmology, but ABC has applicability in a wide range of forward-modeling problems 
in astrophysics and cosmology. 



: http: //www. astro . caltech.edu/ptf/ 

//pan- Starrs . if a.hawaii . edu/public/ 



//www .mso . anu.edu. au/ skymapper/ 
//www . darkenergy survey . org/ 
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1.1. Classical Estimation of Cosmological Parameters from SN la Data 



Cosmological inference with SNe la is a classical statistical estimation problem. We have data, 
our set of supernova light-curve observations, and we seek to infer something about the universe in 
which we live. It is standard in cosmology to adopt a Bayesian approach to inference. To clarify 
our basic conceptual and notational framework, we review Bayes theorem, a simple identity which 
relates the posterior probability distribution-the probability of a set of model parameters given the 
data-to the probability of the data given the model, the likelihood. More precisely, the posterior 
probability distribution is derived as 



7T 



X 



p(x\ 0)vr(0) 
p(x) 



(1) 



where p(x \ 8) is the likelihood, tt(9) is the prior on the vector of model parameters 8, and p(x) is the 
marginal probability of the data x (p(x) = J Q p(x| 8)ir(8)d8). The Bayesian framework is powerful 
in that it allows evidence and experience to modify the prior. The approach is challenging, however, 
in that standard computation methods rely upon full specification of the likelihood p(x| 8); this 
can be challenging in applications of interest. 

For example, consider a cosmological model for which the distance modulus can be written as 
yUmodei = A*modei(^M) ^A; w, z). If we assume that each measured fi has a probability distribution 
function (PDF) described by a Gaussian with standard deviation a we can write the likelihood for 
a single observation as 



p(m, Zi\Q,M, ^A, w) oc exp 



(jH ~ ^model(/», Shf, ^A, w)f 

2a? 



(2) 



If the distance observations are independent after calibration such that there are no correlated 
uncertainties we can simply multiply the likelihood of each observation together. By taking the 
logarithm, we can write a more convenient form of the likelihood as follows 



N 



i=i 



{^i ~ MmodelQ^ w))' 



(3) 



where K is an unimportant constant, giving us the familiar x 2 statistic. Note that the use of this 
form of the likelihood function and \ 2 statistic is based on the assumption of independent data 
with normally distributed uncertainties. 

Traditionally when making cosmological inference with SNe la one calculates the y 2 statis 



tic (IConlev et al.ll201ll : iKessler et alJl2009al : IWood-Vasev et al.ll2007l : lAstier et all 2006; R iess et al 



2004 ) . One method of including syst ematic uncertainties in such a framework is to use the "quadra- 
ture" method, accurately named bv lConlev et al.l (|201ll ). Systematic errors which are not redshift 
dependent and add scatter to the overall Hubble diagram are added in quadrature to the statistical 
uncertainties. For other sources of systematic uncertainty it is typical to perform the analysis with 
and without including the systematic effect on the data. The difference in inferred cosmological 
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parameter is then a measure of the systematic uncertainty. All systematic effects are then added 
in quadrature as the qu oted total sys t ematic uncertainty. This method has been used in recen t 
cosmological analyses by iKessler et al.l (|2009al ) , IWood-Vasev et al.l (|2007l ) , and lAstier et al.l (|2006l ) . 
It has the advantage of being simple to implement but the disadvantage of missing correlations 
between systematic uncertainti es, not produc ing the full likelihood, and could be inappropriate for 
asymmetric error distributions (|Barlowll2003l ), One also has the difficult task of estimating the size 
of the systematic uncertainty and implementing its effect in the analysis. 



Conlev et al.l (|201ll ) presented a more thorough approach to incorporating systematic uncer- 
tainties into a x 2 analysis using a covariance matrix. By implementing a covariance matrix one 
can drop the assumption of independent data in Eq. [3l The covariance matrix can be decom- 
posed into a diagonal, statistical component and two off-diagonal matrices which include statistical 
and systematic uncertainty. These off-diagonal covariance matrices include uncertainties from, 
e.g., uncertainty in the supernova model which is statistical in nature but could be correlated be- 
tween different SNela and uncertainty in ze r o poi nts which would systematically affect all SNe la. 

(2008) and lAmanullah et al.1 (|2010l ) present similar methods which are approxima- 



Kowalski et al. 


( 


2008) 


tions to IConlev et al. 



be modified for uncertainties due to, e.g., type contamination and Malmquist bias. They have 
the effect of adding or removing supernovae from the sample which is difficult to represent in a 
covariance matrix. For systematic effects such as these the field of supernova cosmology is moving 
toward calculating the corrections to the data using artificial SNe la generated from Monte Carlo 
simulations. 

Bayesian inference becomes increasingly difficult as we depart from normal error distributions 
or when the likelihood function is not analytically or computationally tractable. Direct calculation 
of the likelihood may involve many integrations over systematic uncertainties, nuisance parameters, 
and latent variables. These integrations can make the use of standard Markov Chain Monte Carlo 
(MCMC) techniques very challenging and computational expensive. It may also be incredibly 
difficult to construct an analytic probability model over which to marginalize. 

ABC allows one to bypass direct calculation of the likelihood by simulating data from the 
posterior distribution. The posterior distribution is then constructed from the model parameters 
necessary to simulate data which resemble the observed data. By incorporating into the simulation 
all of the statistical and systematic uncertainties for which we have models and priors, the simulation 
knows about the complicated probability model even thought the observer may not be able to 
have the model written out as a set of equations or numerical integrals. By simulating many 
realistic datasets one can marginalize over the nuisance parameters and systematic uncertainties 
such that high-dimensional marginalization problems, as in population genetics for which ABC 
techniques were first developed, are now computationally feasible. ABC is a consistent framework 
to incorporate systematic uncertainties with the cosmological model and more clearly defines what 
it means to use Monte Carlo simulations of artificial SNe la to quantify systematic uncertainty. 
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We begin in § [2] by motivating the general problem and discussing the breakdown of current 
cosmological inference methods using a simple example. In § [3] we outline three separate ABC 
algorithms and discuss their merits. To provide the reader with an introductory example of using 
ABC, we then illustrate how one might perform cosmological inference with Sequential Monte Carlo 
(SMC) ABC using the simple model discussed in §[2 In §U]we present a more sophisticated analysis 
using SNe la from the Sloan Digital Sky Survey-II (SDSS-II) Supernova Survey and demonstrate 
how one might perfo rm cosmological inference with a tool like the SuperNova ANAlysis (SNANA) 
(jKessler et alj|2009bl ) softwa re using SMC ABC te chniques. We compare our results to the cosmo- 
logical analysis performed in iKessler et al.l (|2009al ) using statistical errors only. At the end of this 
section we show that ABC can recover the full posterior distribution when we contaminate the data 
with simulated Type IIP supernovae. We discuss directions for future work in § [5] and conclude in 



2. General Problem Formulation 



Here we establish notation that we will use in discussing the SN la inference problem. Below 
we explain how this framework could be extended to other cosmological inference challenges. Let 
/ij be the measured distance modulus of the i th SN la in our sample, t, be its true distance 
modulus, Zi be the estimated redshift, and 9 be the vector of cosmological parameters. We will use 
bold faced variables to indicate a set of n supernovae, e.g., z = {zi, z n }. Here, we stress that 
the "estimated redshift" will be, in practice, the redshift as estimated from photometry, i.e., the 
photometric redshift. 

The underlying objective is to determine the posterior of the cosmological parameters 9 given 
the observed data (/a, z). There are two natural analytical routes, both of which lead to the same 
challenges. The first route is to note that the posterior of 9 can be decomposed as 



7r(0| /x,z) = Kp(fi\ 0,z)tt(0,z) 
where K is a constant that does not depend on 9 and 



P (Ai|0,zMM) 



J 0,z,r)p{r \ 0,z) dr 
j P{f*\ 0,t)p(t\ 9,z) dr 



(4) 



(5) 



(6) 



Note that in this last step, the density of /x conditional on 9 and z is replaced with the density of 
H conditional only on 9. Here we are assuming that \i and z are independent given r: Once r is 
known, the information in z does not affect the distribution of /x. We note that this assumption is 
not true if one is using the photometric redshift determined from the supernova light curve. 

We could pose this problem in general statistical terms as follows. Assume that = {fi±, fi2, • • • , fJ-n} 
are random variables such that the distribution of in is determined by parameters 9 and Tj. Here, 9 
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represents the unknown parameters common to the fa while t% are the object-specific parameters. 
We further assume the existence of additional data, denoted Z(, which have the property that fa 
and Zi are independent conditional on Tj. The quantities Z{ can be thought of as properties that 
help in the estimation of Tj, but would not be useful for estimating if Tj were known. 



Note that each of fa and Tj could be vectors. For example, in lMandel et al.l (120111 ). fa stores the 
full observed light curve of the supernova and Tj comprises not only the true distance modulus, but 
also parameters that capture the effect of extinction and dust and that define the true, underlying 
light curve. As mentioned above, these have the property that, if r« were known, z% would not 
provide useful additional information for the estimation of 6. 

The second route is to rewrite the posterior as 

tt(0|,u,z) = J p(8,T\n,z)dr (7) 

and then rely upon the fact that, as derived above, 

P (0,t\ /x,z) =p(/x| 9,r)p(r\ 9,z)tt(9,z) (8) 

to construct a hierarchical Bayesian model for the unknown "parameters" which now consist of 
both 6 and t. To analytically obtain the posterior in terms of only 9, one must integrate over r, 
i.e., find 

J p( t i\8,T)p(T\0,z)dr. (9) 

This is exactly the form of the challenging integral that was confronted above in Equation ([6]). One 
can often justify further conditional independence assumptions and write 

fp(jM\e,T)p(r\e,*)dT = ff[p(fa\9,T i )p(T i \9,Z i )dT (10) 

J i=l 

n „ 

= Y\ I p{fa\8,Ti) p(Ti\9,Zj) dTi. (11) 
i=l 

Still, the computational feasibility of using analytical approaches to finding the posterior for 8 will 
depend on the form of 

p{fa | 9, z^ = / p(fa | 9, n) p(n | 9, z^ dn. (12) 



In practice, the complex nature of photometric redshift estimators will yield a complex form for 
the distribution p{ji \ 9, Zi). 

An alternative is to adopt the "second route" described above but instead utilize MCMC 



meth ods to simulate from the posterior for both (0, r). This is the approach taken in lMandel et al 



(|201ll ). This avoids the integral over Tj, but it is still apparent that practical implementation of 
analytical or MCMC methods when n is large (and hence r is of high dimension) forces one to 
make choices for p(fa \ 9,Ti) and p(ji \ 9,Zi) which may not be realistic. Unfortunately, as n gets 
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large, even small mistakes in the specification of these densities could lead to significant biases in 
the estimates of the parameters. This is one of the fundamental challenges facing cosmology as we 
are presented with ever-larger data sets. In what follows we will develop an example that illustrates 
this point. 



2.1. A Simple Example 

To begin, note that in the present example //j is the measured distance modulus, zi is the mea- 
sured redshift, Tj is the true distance modulus, and 9 represent the set of cosmological parameters. 
We ignore for the moment all parameters which affect the measured distance modulus except Zi 
and 9. The measured redshift z\ may differ from the true redshift of the supernova, which we will 
denote Q. Consider the following three scenarios: 

1. Zi = Q, i.e., the redshift is known exactly. In this case, and under our simplifying assumptions, 
we know exactly the value of Tj, and hence the "density" p(ji \ 9, z{) is a delta function at this 
known value. 

2. The redshift is observed with some normal error. We model d with a Gaussian PDF with 
mean zi and variance a 2 zi . In this case we can apply the so-called "delta method" and state 
that p(ti \ 9,Zi) is approximately Gaussian with mean fi(zi,9). This scenario is analogous to 
measuring a spectroscopic redshift with a small error such that a Gaussian approximation for 
the PDF of Q is sufficient or a photometric redshift which has a PDF which can be modeled 
well by a Gaussian. 

3. Zi is observed with some complicated uncertainty. The PDF is not described by a simple 
function although p{ji | 9, Zi) may be estimated using observed data. This is the case for most 
photometric redshifts. 

Of course, the first case is unrealistic. In order to demonstrate the pitfalls of making unwarranted 
assumptions regarding the likelihood function, we will first focus on the second case, in particular 
assume that p{ji \ 9, Zi) is a Gaussian density with mean 9). The rationale for this approxima- 
tion relies on the assumption that the true redshift d also has a Gaussian distribution, in this case 
with mean Zi and variance o 2 zi . The true distance modulus is Tj = fj,((i,9), so, using the standard 
linear approximation, we can argue that Tj is approximately normal with mean fi(zi,9) and variance 

2 

(13) 

Then, the observed distance modulus can be modeled as the true distance modulus plus some 
additional Gaussian error; this is taken to have mean zero and variance (cr^j) 2 . In a real-life 
application this variance includes uncertainty from the observed intrinsic dispersion in distance 
modulus and uncertainty from fitting the light curve. 



fe) 2 = 



dfi{Zi,9) 

dzi 
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This is the curr ent approach in most cosmological analyses where one has spectroscopic red 



shifts for each SN la (|Conlev et al.ll201ll : lKessler et al.ll2009al : IWood-Vasev et al.ll2007l : lAstier et al 



2006). The uncertainty in redshift is transferred to the uncertainty in measured distance modulus 
and one can find an analytic solution to Eq. [12] by noting that the integral is simply the convolution 
of two normal densities. Hence the result of Eq. [12] is another normal density, but now with mean 

fj,(zi,9) and variance f^j) + (<?>,i) 2 - This approach is also possible for larger uncertainties like 
those from photometric redshifts, but the concern becomes the fact that the linear approximation 
utilized does not extend to larger ranges of redshift. In what follows we examine the consequences 
of making this Gaussian assumption for photometric redshift uncertainties when the approximation 
is not valid, i.e., we treat scenario 3 as if it were scenario 2. 

Fig.[T]shows the photometric versus spectroscopic redshift for a sample of 1744 SNe la generated 
using SNANA0 version v9_32 and smoothed with a Gaussian kernel. To make this figure, light curves 
were simulated and fit from the MLCS2k2 model ()Jha et al.l 120071 ) as described in Section 2] with 
the following changes; we fix the cosmology to Qa = 0.73, Qm = 0.27 and w = —1, and we estimate 
photometric redshifts when we fit the light curves without using a host galaxy photo-z priori We 
use this sample to represent a realistic joint distribution between the spectroscopic and photometric 
redshifts. We further assume that the spectroscopic redshift is equal to the true redshift C = z spec 
and the observed redshift is the photometric redshift z = ^ phot . 

Fig. [2] shows three cross-sections of the joint distribution of spectroscopic and photometric 
redshifts, comparing the photometric redshift distribution with the assumed Gaussian PDF. Our 
proposed model assumes that the horizontal cross-section of this distribution at z phot is Gaussian 
with mean equal to z spcc . This figure demonstrates that the Gaussian approximation to the dis- 
tribution of z spcc is not terrible. Further, under this Gaussian approximation n(zf hot ,9) should 
be approximately normal with mean Tj, i.e., under the linear approximation the distance modulus 
estimated using the photometric redshift has mean equal to the true distance modulus. Fig. [3] uses 
boxplots to show the distribution of Tj — ^(zf hot , 6) at various values of z phot for the simulated data. 
This plot reveals that there are significant deviations from the expected difference of zero. 

The effect of this bias is made clear in Fig. [4] This figure shows the 95% credible region as 
constructed by two different methods, which will be described below. In both cases, the data set 
utilized is the same. To construct this data set we simulated a sample of 200 SNe la by drawing with 
replacement from the (z spec , z p]lot ) sample shown in Fig. [I] We then calculated r = ^{z spec ^ 9), where 
9 consists of w = —1 and Qm = 0.27 and assumed a flat universe. Finally the observed distance 
modulus ji is constructed by adding mean-zero Gaussian error onto r with variance of, ■ = 0.04. 



i-t.i 



E http : //sdssdp62 . f nal . gov/ sds s sn/ SMAM A-PUBLIC/ 1 



6 Please see Section 4.9 of the SNANA manual for details on measuring SN la redshift from photometry 
http : //sdssdp62 . f nal . gov/sdsssn/SNANA-PUBLIC/doc/ snana_manual . pdf 
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The posterior for 6 is found for this dataset in two ways, and the 95% credible region^ is displayed 
for each. 

1. The solid line shows the credible region if the posterior is constructed using z spec . It will serve 
as the fiducial reference for comparisons to the other region. 

2. The dashed line is the credible region that results from using the approximation described 
above, i.e., assuming that the observed distance modulus has a Gaussian PDF with variance 

2 

^Phot, + o* ti . (14) 



8. 



phot 



The point of emphasis here is that the additional uncertainty in the redshift is now taken into 
account and reflected in the extra width of the region as compared to the solid region. The 
shift from the solid region to the dashed region is the result of a bias. 

The bias shown in Fig. 0] is much like the attenuation bias that results from inappropriately 
taking into account the errors in the predictor variables in a regression setting: simply adding more 
error into the response will not adequately account for this additional error. There are methods 
for dealing with this additional error, but these are not practical in this setting because of another 
fundamental challenge: the variance of the error in redshift cannot be assumed to be constant, 
it needs to be modeled as a function of redshift. This heteroskedastic error introduces significant 
obstacles to any method that would seek to "back out" its effect on the estimates. In the next 
section we will instead consider approaches that exploit our ability to model and/or simulate the 
forward process that generated the data, and hence allow us to incorporate in a natural way the 
errors due to the use of photometric redshifts. 



3. Approximate Bayesian Computation 

ABC methods simulate observations from the posterior distribution via algorithms that bypass 
direct calculation of the likelihood. This is done by drawing model parameters from some distri- 
bution, generating simulated data based on these model parameters and reducing the simulated 
data to summary statistics. Summary statistics are measures of the data designed to reduce the 
dimensionality of the data: they represent the maximum amount of information in the simplest 
form. Model parameters that generate data sufficiently similar to the observed data are drawn 
from the posterior distribution. This procedure allows one to simulate the complicated integral in 
Eq. [T2l rather than evaluate it but instead relies upon the ability to simulate the forward process 
that generated the observed data. 



7 The region which comprises 95% of the probability under the posterior is referred to as a credible region to 
distinguish it from a frequentist confidence region. 
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Here we review two classes of ABC algorithms; ABC rejection samplers and adaptive ABC 
algorithms. The roots of ABC techniques lie in the first class while the goal of adaptive ABC algo- 
rithms is to efficiently determine the relevant regions of parameter and probability space to sample 
from. In this section we will adopt a Bayesian approach and endeavor to determine (approximately) 
the posterior distribution of model parameters 9 given observed data x. The posterior is given by 



7T 



p(x| 0)tt(0) 
p(x) 



(15) 



where p(x| 9) is the likelihood fu nction and jofx) is a normalization constant. For a review on ABC 
algorithms we refer the reader to iMarin et al.1 (|201ll ) . 



3.1. ABC Rejection Samplers 

The basic ABC prescription is best considered for a situation in which the data x are discrete: 



Rejection Sampler: Discrete Case 

1. Draw candidate 9* from ir(0) 

2. Simulate data x* ~ p(x* | 9*) 

3. Accept 9* if x* = x 

Repeat these steps until ./V candidates are accepted. 



Under this algorithm, the probability that 9* is accepted is exactly tt(9 \ x). Hence, it is simple in 
principle to generate a sample of size N from the posterior distribution. This sample is then used 
to estimate properties of the posterior distribution such as the 95% credible region. 

In practice, however, most data are continuous, and we must instead decide to accept 9* if x* 
is suitably "close to" x; hence, a metric or distance A(x*,x) must be chosen. Under this setup, 
the accepted parameter vectors 9* are drawn from the posterior distribution conditioned on the 
simulated data being sufficiently close to the observed data. More precisely, the result will be a 
sample from the joint distribution p(x,9\ A(x, x*) < e) where e > is a fixed tolerance. If e is 
small and one marg inalizes over x, then p{9 \ A(x, x*) < e) is a reasonable approximation to tt(9 \ x) 



(jSisson et al.l 120071 ) . Note that if e is very large the sample will be effecti vely drawn fr o m the 
prio r. The continuous ver sion of the ABC rejection sampler, introduced by iTavare et al.l (119971 ) 
and iPritchard et al.1 (I1999I ). is built upon this idea: 
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Rejection Sampler: Continuous Case 

1. Draw candidate 9* from tt(9) 

2. Simulate data x* ~ p(x* | 9*) 

3. Accept 9* if A(x*,x) < e 

Repeat these steps until TV candidates are accepted. 



If the data have many dimensions, requiring that A(x*,x) < e may be impractical. For 
example, it would be nearly impossible to simulate 10 3 supernovae to within e of the observed data 
even with the correct cosmology due to random photometric error, let alone population variance in 
realizations of stretch and color distribution. 



Fu Lil (|1997l ) and I Weiss &: von Haeselerl ([1998) improved Step 3 by instead making the com- 



parison between lower-dimensional summaries of the data; here these will be denoted s(x), or just 
s. The ideal choice for s would be a summary statistic that is a sufficient statistic for estimating 
9. Technically, a vector s is sufficient if p(x| s,9) is not a function of 9, and hence the posterior 
conditioned on s is the same as the posterior conditioned on x, i.e., tt(9\s) = 7r(#|x). Of course, one 
cannot expect to derive an exactly sufficient statistic when the form of the likelihood is not known. 
Hence, much current research in ABC is focused on the derivation of approximately sufficient 
statistics or, more gen erally, summary sta tistics that preserve important information regarding the 
parameters of interest. I Blum et al.1 (J2012J) provide an excellent overview and comparison of methods 
for constructing summary statistics. These methods generally fall into two categories: those that 
sift through a list of candidate summary statistics to find the "best" summary statistic as measured 
by some optimality criterion, and those that utilize the ability to simulate data sets under different 
parameter values as part of a process of fitting a regression where the responses are the parameters, 
and the predictors are the simulated data. This mapping is then used to transform observed sum - 
mary statistics to parameters. For example, an early such example was iBeaumont et al.l (|2002l ). 
who fit local linear regression to simulated parameter values on simulated summary statistics . The 
regression approach can b e justified on theoretical grounds, see iFearnhead &; Prangld (120121 ) . and 
Cameron &: Pettittl (120121 ) used this approach for their astronomical application. In our work, the 
relatively simple structure of the relationship between the simulated data and the parameters of 
interest leads to a natural approach to constructing a summary statistic: exploiting the known 
smooth distance modulus/redshift relationship. In other applications, there will not exist such a 
simple one-dimensional representation of the data, and these sophisticated approaches must be 
utilized. 

There are advantages to the general ABC rejection sampler approach. Since each accepted 
parameter represents an independent draw from p(9\ A(s*,s) < e), properties of the posterior 
distribution are easily estimated from the accepted sample. There are no problems with such 
estimation due to dependence in the sample. Also, the ABC rejection sampler is simple to code 
and trivial to parallelize. However, the success of this method depends on how easy it is to simulate 
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data from the model. If the model is complicated or if the acceptance rates are small, then the 
algorithm can be very expensive or inefficient. A low acceptance rate can be caused by a diffuse 
prior relative to the posterior or by a poor choice for the summary statistic. It is natural to consider 
approaches that do not rely upon independent sampling from the prior. In particular, one would 
anticipate that it would be possible to "learn" from the parameter values that have been accepted 
in the past to determine where good choices for future candidates 9*. 



3.2. Adaptive ABC Algorithms 



The aforementioned challenges are the major motivations for the use of MCMC techniques: 
instead of relying on rand om draws from a distri bution to produce candidates, random walks are 
taken in parameter space. iMarioram et al.l ([20031 ) presented an MCMC version of ABC as follows: 



ABC MCMC 

Initialize 9i, i = 1 
For i=l to i=N do: 

1. Propose a move to 9* according to a transition kernel q(9i 

2. Simulate x* ~ p(x* | 0*) 

3. Measure s* from x* 

4. If A(s*, s) < e proceed, else go to Step 1 

5. Set Oi+i = 9* with probability 



h(9i 



min 1 



7T(9*)q(9i 



7T 



and otherwise, 9i + \ = 9{ 
6. i = i + 1 



Here q{9i — > 9*) is a proposal density, h(9i,6*) is the Metropolis-Hastings acceptance probability 
and N is the ch ain length. The chain l ength is determine d after meeting some c onvergence cri- 
terion (see, e.g., ICowles &: Carlinl ([1996)). As is proved in iMarjoram et al.l (|2003l ). the posterior 
distribution of interest tt(9 \ x) is the stationary distribution of the chain. 

The MCMC ABC algorithm can be much more efficient than the ABC rejection sampler, 
especially when the posterior and prior distributions are very different. This efficiency is gained, 
however, at the cost of highly correlated #j. Additionally, the MCMC ABC sampler can become 
inefficient if it wanders into a region of parameter space with low acceptance probability with a 
poor perturbation kernel. Successive perturbations have a small chance or being accepted and the 
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chain can get "stuck." It is worth noting that this algorithm is replacing the likelihood ratio present 
in standard MCMC techniques with a one or zero based on whether or not A(s*, s) < e. This is a 
significant loss of resolution in the information that was present in the likelihood ratio. 



Sisson et al.l (|2007l ) (improved upon by lBeaumont et al.l (|2009l )) overcome the inefficiencies of a 



MCMC ABC algorithm via a method which they term Population Monte Carlo or Sequ ential Monte 



Carlo (SMC) ABC. The SMC ABC approach adapts the SMC methods developed in lMoral et al 



(|2006l ) to ABC. The algorithm learns about the target distribution using a set of weighted random 
variables that are propagated over iterations, similar to running parallel MCMC algorithms which 
interact at each iteration. The basic recipe of the SMC ABC algorithm is to initialize N points 
in parameter space according to tt(6). Points or particles are drawn from this sample, slightly 
perturbed, and are accepted for the next iteration if they meet the e criterion. For each iteration, 
the tolerance e is decreased, slowly migrating the N particles into the correct region of parameter 
space when we have reached a pre-specified tolerance threshold. 
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SMC ABC 

Fix a decreasing sequence of tolerances e = ei, £2, er 
For the first iteration, t=l: 



For i=l to z=iV do: 

1. Draw 9\ from 7r(0) 

2. Simulate x • ~ p (xj | 0|) 

3. Measure s* from x* 

4. Proceed if A(s|, s) < et, else return to Step 1 

5. Set Wi = l/N 

6. i = % + 1 

Take equal to twice the weighted variance of the set {9\ : i = 1, ...,N}. 



For t=2 to i=T do: 



For i=l to z=iV do: 



1. Draw 9* from {0* 1 : j = 1, N} with probabilities {w* x } 



2. Generate 0| frorxi 'K(9*,T t 

3. Simulate x| ~ p 

4. Measure from x 



5. Proceed if A(s*, s) < else return to Step 1 

6. Set 



Wi oc 



t-i 



7. i = i + l 

Take r t 2 +1 equal to twice the weighted variance of the set {9\ : i = 1, iV} 



Here, is a kernel which could be, e.g., a Gaussian kernel such that K(x) oc exp(— x 2 / 2) an d 

the weights are normalized after TV" points have been selected. Following iBeaumont et al.l ((2.009), 
each particle is perturbed using a multivariate normal distribution with mean centered on the 
particle's current position 9* and variance equal to twice the weighted empirical covariance matrix 
of the previous iteration N{9*,rf). Some work has been invested determining the most efficient 
method of perturbing points and includes implementing a locally adapted covariance matrix and 
incorporating an estimate of the Fisher information (see lFilippi et al.l (|201ll )). 



Since the target distribution is approximated by a random sample of iV particles that have 
migrated over iterations, properties of the posterior distribution are again properties of the sample, 
i.e., there is no covariance between the points as in the MCMC case. Using the importance weighting 
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scheme in iBeaumont et al.l (j2009l ) along with the distribution of particles in parameter space allows 
one to construct an estimate of the posterior distribution and derive estimates of parameters of 
interest based on this posterior. 

SMC ABC has some distinct advantages over the other ABC methods. Both the ABC re- 
jection sampler and the MCMC ABC scheme become very inefficient when the tolerance is small. 
SMC ABC derives its efficiency instead from sequentially learning about the target distribution by 
decomposing the problem into a series of simpler sub-problems. The sequence of e's can be chosen 
such that the acceptance rates are never too poor and the algorithm converges at a reasonable 
rate. However, if the sequence of e decreases too slowly the algorithm will be too computationally 
expensive and if it decreases too rapidly the acceptance rates will be too small. An inefficient 
perturbation kernel will also result in a poor exploration of the space and similarly poor acceptance 
rates as many simulated datasets will be generated before A(s|, s) < et is reached. 



ABC is an active field of research. Recent improvements have been made by I Barnes et al 



(|201ll ). who employ an information-t heoretical framework to construct approximately sufficient 
statistics and iBlum fc Francois! (|2010i ). who introduce a machine learning approach to estimate 
the posterior by fitting a nonlinear conditional heteroscedastic regression of the parameters on the 
summary statistics. The estimation is then adaptively improved using importance sampling. For a 
re yiew and study of t he improvements made in ABC methods in recent years we refer the reader 
to Marin et al. (2011). 



3.3. Example: Revisited 

Here, we apply SMC ABC to the stylized SN la inference example introduced in §[2j The model 
is the same as was specified in that section. The "observed data" are simulated by constructing a 
sample of 200 SNe la under a flat cosmology with = 0.27 and w = —1. For this toy example, 
Hq is assumed to be perfectly known as 72 km s _1 Mpc -1 . 

Fig. [5] depicts key steps in the SMC algorithm as applied to this situation. The prior is chosen 
to be uniform over the region < &m < 1 and —3 < w < 0. A collection of 500 (Ojvf, w) pairs, often 
called particles in the context of SMC methods, is migrated through the iterations of the algorithm. 
Fig. [5^, shows the collection of 500 particles at the conclusion of one of the early time steps. One 
of these particles is chosen at random and perturbed a small amount; the parameter combination 
is Qm = 0.11 and w = —1.21, and is shown as the star in the plot. This parameter combination 
in denoted 9j in the algorithm above. Simulated data are created by drawing a collection of 200 
(z, z') pairs, sampling with replacement, from the collection shown in Fig. [U With Of specified 
and the 200 true redshifts, it is trivial to calculate the distance modulus of each SN la, and then 
add uncertainty using a Gaussian PDF with variance (c^j) 2 = 0.04. Fig. [5)3 shows the resulting 
simulated distance moduli plotted against the photometric redshifts z. The point is that this is 
a plot that can be created using observable data: these data comprise the x* that appear in the 
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algorithm above. 

A key step in any implementation of SMC ABC is the choice of the summary statistic. Here, 
the summary statistic s\ is found by applying a nonparametric regression smoother through these 
data; this curve is shown in Fig. [5b. (The approach used to perform this smooth is briefly presented 
in the Appendix.) The motivation for this choice is as follows: as stated above, ideally we would 
choose a sufficient statistic as our summary statistic. A sufficient statistic is a summary that 
separates out from the full data that portion which is useful for estimating 9. In this case, we 
know that the relationship between redshift and distance modulus for fixed 9 is a smooth curve. 
The deviation of the data from a smooth curve can be solely attributed to random error in the 
measurements, error which is not all informative of the value of 9. For this reason, it is reasonable to 
believe that a smoothed version of the points shown in Fig. [5b captures all of the useful information 
for estimating 9. 

The comparison between the real and simulated data will be done via these smooth curves. 
Fig. [5^ shows the observed data, along with the result s of applying the same smoothing procedure 
to these data. Finally, in Fig. [5H, these two curves are compared via a simple distance calculation 
between these curves, namely, the sum of squared deviations across the length of the curve. The 
particle is accepted in this iteration, because even though the curves differ at high redshift, the 
tolerance is not sufficiently small yet to reject at this difference. Fig. [6] shows how the collection 
500 particles evolves over the steps of the algorithm. As the steps progress, the particles converge 
in and approximate a sample from the posterior. The notable feature of this result is that this 
posterior is centered on the solid contours. Just as in Fig. 01 these contours represent the posterior 
as derived by someone who had full knowledge of the redshifts. It is clear that by avoiding the 
unjustified Gaussian assumptions made in § [21 the bias that was present in the previous posterior 
based on photometric redshifts has been removed. 



4. SMC ABC Cosmology with SDSS-II Supernovae 



In this section we apply SMC ABC to fi rst year data from the SDSS-II Supernova Sur- 



vey (jHoltzman et al J 120081 ; 



Kessler et aL 2009a). The developm ent of the sophisticated supernova 



simulation and analysis software SNANA (IKessler et al.ll2009bl ) has made possible the comparison 
between the SDSS-II supernova sample and simulated data sets and is a natural first choice to 
test ABC methods in cosmology. The purpose of this section is to demonstrate that ABC can be 
used to estimate an accurate posterior distribution. We use the spectroscopically confirmed sample 
to estimate cosmological parameters from assuming a spatially flat universe and a constant dark 
energy equation of state parameter, w. In this section we discuss how we create simulated data 
sets, our ABC setup, and compare our posterior distributions for the matter density VLm and the 
equation of state parameter w with those from a x 2 analysis using statistical errors only. We close 
this section demonstrating the full utility of ABC by including Type IIP supernovae contamination 
to the SDSS sample and estimating the correct posterior distribution with ABC. 
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4.1. Simulation Setup 



For this analysis we will use d ata from the fall 2005 SDSS-II Supernova Survey which were 
published in iHoftzman et al.l (|2008l ) . For detailed in formation regarding t he s cientific goals and 
data processing for the survey we refer the reader to iFrieman et al.l (|2008l ). to ISako et al.l ([2008) 



for details of the sup ernova search algorithms and spectroscopic observations and to Section 2 of 
Kessler et al.l ( 2009al ) for a brief summary of the survey. 



Our goal is to c ompare the derived posterior distributions for Qm and w using ABC with those 
from iKessler et al.1 ( 2009al ) which were done using a more traditional % 2 analysis. To make this 
comparison as meaningful as possible we apply the same relevant selection cuts to the data. There- 
fore, defining to as the time of peak brightness in rest-frame B according to MLCS2k2 such that 
t — to = 0, we require for each SN la light curve, one measurement before peak brightness and one 
measurement more than 10 days after peak brightness. Additionally we require five measurements 
with — 15 < t — to < 60 days. These r equirem ents ensure adequate time sampling to yield a robust 
light-curve model fit. IKessler et al.l (j2009al ) additionally require one measurement in gri with a 
signal-to-noise greater than 5 to put a floor on the quality of data and require Pgt > 0.001, where 
Pfit is MLCS2k2 light-curve fit probability based on x 2 - This requirement is designed to remove 
obvious peculiar SNe la in an objective fashion. 

All supernovae in this sample have unambiguous spectroscopic confirmation and we use pho- 
tometry in q, r, and i bands. This leaves us with 103 SDSS SNe la. This sample is identical to 
Kessler et al.l (j2009al )'s sample A and can be taken from their Table 10. 



We can broadly separate the treatment of variables in the likelihood into two categories: (1) 
those which are of cosmological interest and (2) nuisance parameters. One will be able to construct 
posterior distributions for all parameters in the first category, in this case = [Qm,w], while 
sampling from the probability space spanned by the set of nuisance parameters when generating 
simulated data sets. 

We use SNANA to simulate sets of supernovae from different cosmologies. The idea is to ran- 
domly sample from the probability distributions of each nuisance parameter every time a simulated 
set of supernovae is generated. If we were to fix the cosmology and simulate many data sets, the 
probability space spanned by the nuisance parameters should be reflected in the variance of the 
sets of simulated data. 



Within SNANA we will use the MLCS2k2 model (jjha et all 120071 ) to simulate SN la light 
curves. We use the same modified version of MLCS2k2 that was developed and trained in lKessler et al 
(|2009al ). In this model the observed model magnitudes corrected for Galactic extinction, K- 
correction, and time dilation, for each passband, X, are given by 



m x (t-to) = M° x +fi + Cx(a x + ^jA 



+P X A + Q X A 2 



(16) 
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where M^- are the fiducial absolute magnitudes, fiQ is the distance modulus, Ry and Ay are the 
host galaxy extinction parameters, and Pj and Qx describe the change in light-curve shape as a 
quadratic function of A. Quantities that are functions of phase are in bold. M^, Pj, and Qx are 
estimated from a training set leaving to, /j,q, A, Ay, and Ry as the free parameters. 

The distance modulus can be related to the luminosity distance for a flat universe with a 
constant dark energy equation of state parameter of w = — 1 in the following way 



Mo 



5 log (dx/lOpc) 

5 log (c(i + z) jf [n M (i + z'f + n A ) 1/2 dz'^j 

-5 log #0 + 25. 



(17) 
(18) 



Note that a change in Hq simply scales the distance modulus. It is easy to see that if one rewrites 
Eq. [16] in terms of luminosity distance that a degeneracy arises between Hq and My. Even if Hq 
is known from some other experiment, My would still need to be marginalized over. 



£x is defined as 



X 



(19) 



and is equal to unity at maximum light. This framework allows one to separate out the time 
dependence of the extinction while being insensitive to the total extinction E(B — V) and the 
extinction law Ry. 

A major advantage of MLCS2k2 is that it allows one to separate reddening resulting from 
dust in the host galaxy (third term in Eq. fT6|) from intrinsic color variations of the supernova 
which are captured by A. The validity of this approach depends on how separable these two terms 
are, how well intrinsic color is predicted by light curve shape, an d relies on accurate models of the 
distribution of extinction with redshift (jWood-Vasev et al.1 120071 ). 



To generate a simulated set of data, we assume a flat universe and choose Qm and w from flat 
priors over the range [0, 1] and [—3, 0] respectively. One could instead draw c osmological parameters 
from priors based on the SDSS detection of the baryon acoustic oscillations (jEisenstein et al.l l2005) 
and the five-year Wilkin son Microwave Anisotropy Probe observations (WMAP-5) of the cosmic 
microwave background (IKomatsu et al.ll2009l ). A random supernova redshift is selecte d from a 
power law distribution given by g ~ (1 + zf where /3 = 1.5 ± 0.6 dDildav et alJl200fih . A and 



Ay are then drawn from empi rical distributions det ermined in Section 7.3 of 



Kessler et al. 



Using the parameterization of Carde lli et al.l (119890 to describe the extinction with Ry 



12009?; 
2.18 (as 

determined from Section 7.2 in lKessler et al.ll2009al ). the MLCS2k2 light-curve model can now be 
used to generate s upernovae magnitudes which are then if-corrected using spectral templates from 
Hsiao et al.l (|2007l ) into observer frame magnitudes. 



SNANA then chooses a random sky coordinate consis tent with the observed survey area and 
applies Galactic extinction using the ISchlegel et al.1 (jl998l ) dust maps, chooses a random date for 
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peak brightness, and selects observed epochs from actual SDSS survey observations. Noise is 
simulated for each epoch and filter and includes Poisson fluctuations from the SN la flux, sky 
background, CCD read noise, and host galaxy background. 

The simulation allows one to add additional intrinsic variations in SN la properties to better 
match the observed scatter in the Hubble diagram. We do this by "color smearing." A magnitude 
fluctuation drawn from a Gaussian distribution is added to the rest-frame magnitude for each 
passband leading to a change in model colors of ~ 0.1 mag. SNANA also includes options to model 
the search efficiency of the survey. 

The aforementioned selection cuts on the observed data are then applied to the simulated data. 
This process is done for a selected cosmology for ~ 100 SN la over the redshift range of [0.02, 0.45], 
similar to the SDSS data, assuming a redshift uncertainty of 0.0005. Finally, the distance modulus 
is measured by performing an MLCS2k2 light curve fit assuming the same prior on Ay and A from 
which the data were simulated. 

In Fig. [7]we plot the distance modulus as a function of redshift for the SDSS data in blue and a 
simulated data set in red. For the simulated data set we assume that SIm = 0.3 and w = —1.0. The 
simulated data have been offset by 1 mag for clarity. The distance modulus uncertainties, intrinsic 
scatter, and redshift distributions are similar between the simulated and observed data sets. 

4.2. SMC ABC Implementation 

To calculate the measure of similarity between the observed and simulated data sets, A(s|, s), 
we turn to the Hubble diagram. In the top panel of Fig. [8] we show [i versus z for our observed 
data and a simulated data set with £Im = 0.1, and w = —2.0. A reasonable distance measure could 
be the Euclidean distance between the data sets at the redshifts of the observed data. However, in 
keeping with the notion of summary statistics, we would like to compare a smooth representation 
of the two data sets rather than the data themselves. In the bottom panel of Fig. [8] we show a 
non-parametric smooth of the simulated and observed data. The details on how we perform the 
non-parametric smooth are in Appendix El We opt for a non-parametric smooth in the interests of 
efficiency and to prevent inserting additional assumptions about the data in an intermediate step 
in contrast to fitting the data with a cosmology fitter. We now define A(s*,s) to be the median 
absolute deviation between the smoothed data sets evaluated at the observed redshifts. We choose 
this because it is simple, it is robust to poor smoothing at high and low redshifts, and allows for 
a physical interpretation of the minimum tolerance. Since we are basically measuring the distance 
between the two data sets in distance modulus, we consider our minimum tolerance to be equal 
to the median uncertainty in the smoothed observed data, i.e., we declare the observed data and 
simulated data sets sufficiently similar when the simulated data are within the error of the observed 
data. 

For simplicity in this analysis we fix the value of Hq to 65 km s" 1 Mpc -1 to restrict the 
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relevant region of parameter space. This improves the efficiency of the ABC algorithm and more 
importantly, makes the comparison between ABC and x 2 more striking. However, we note that for 
this particular definition of the distance metric the simulated value of Hq directly scales A(s*, s) in a 
trivial manner. One could naively treat Hq as a nuisance parameter and randomly sample Hq from 
a flat prior over some range. Since Hq, w, and Q.m are correlated, a faster approach would be to 
add Hq as another cosmological parameter, adding a third dimension to the parameter space. The 
particles would then trace out the three-dimensional posterior distribution from which one could 
marginalize over Hq to obtain the two-dimensional projection. Given the simulation expense, one 
would like to take advantage of the simple relationship between Hq and A(s*,s). To this end one 
could calculate a set of A(s*, s)s corresponding to a range of Hubble parameter values for a given w 
and Qm- The particle is then accepted with a percentage based on the number of A(s', s) elements 
that meet the tolerance criterion. This avoids re-simulating data sets a given number of times over 
a range of Hq values while still sampling the probability space fully and thus marginalizing over 

Hq. 

We choose et according to the distribution of {A(s* _1 ,s) : i = 1,...,N} instead of having a 
predefined sequence of tolerances to walk though. For the first iteration, we accept all points, i.e., 
the tolerance is infinite. For the next iteration, t = 2, the tolerance et=2 is set to the 25% percentile 
of {A(s l 1 ,s)}. All subsequent es are the 50% percentile of the previous iteration. A percentile 
which is too large allows for many acceptances and will not localize into the correct region until 
T is large. Conversely, if one is too strict in their sequence of tolerances, many simulations are 
required before a point is accepted. We found that putting a stricter cut on what e should be early 
on helps concentrate quickly into the correct area of parameter space, requiring fewer simulations 
in future iterations. 

We define e to be sufficiently small when it is less than the uncertainty on the non-parametric 
smooth of the observed data, which we estimate via bootstrap. The median uncertainty on the 
non-parametric smooth for the SDSS data set is 0.033. We require A(s*,s) for each particle to be 
less than this value at the final iteration. 

We choose N = 150 particles and run the code on eight different processors. As the initial 
particles are independently drawn between the three runs, the results can be combined to better 
estimate the posterior distribution. However, the sequence in e is slightly different for each run. In 
practice one should parallelize the code at the level of accepting N points so that there is just one 
sequence of tolerances. Ours do not vary significantly and is not a concern for our demonstration. 

Properties of the posterior distribution are then drawn from the final sample of particles and 
their weights which meet the minimum tolerance criteria. 
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4.3. Results and Discussion 



It is useful to first review the cosmological analysis performed in lKessler et al.l (|2009al ). MLCS2k2 
provides an estimate of the distance modulus for each supernova. The x 2 statistic is then calculated 
over a grid of model parameters and used to derive cosmological parameter estimates. Recall that 
— 21n(7r(#| x)) = x 2 - The x 2 statistic for the SDSS supernova sample is calculated according to 



°2 



where fa and Z\ are the distance modulus returned from MLCS2k2 and measured redshift of the 
supernova, and /i mo d is the model magnitude. The distance modulus uncertainties are given by 



°1=(<) 2 + «Y + KY (2i) 

where cr™" is the statistical uncertainty reported by MLCS2k2, a 1 ^ = 0.16 is additional intrinsic 
error, and 

< = °* (mlo) z(l + z/2Y (22) 

The posterior distributions for VLm and w assuming a flat universe can then be found by marginal- 
izing over Hq. Recall for our comparison that we are fixing the value of Hq and do not need to 
marginalize over Hq. 

In Fig. [9] we compare our posterior distribution to that found using the approach described 
above. The top plot shows the particles from the final iteration of the SMC ABC algorithm. The 
area of the particle symbol represents the weight. These points and their weights represent a sample 
from the posterior distribution. We estimate the 95% credible region from this sample and compare 
with the 95% confidence region from a x 2 analysis in the bottom plot. Overall the contours are 
well matched. The weights on the particles become large just inside the hard boundaries set by 
the priors on Qm and w. The algorithm is accounting for the fact that there is parameter space 
beyond the boundary which it cannot explore. This is similar to an MCMC algorithm running into 
a boundary and sampling more in that region because it cannot cross the boundary. As a result 
the ABC contours become wider than those from x 2 near the boundaries. 

We reiterate that the goal of this exercise was not to derive new cosmol ogical constra i nts bu t 



merely to see how well we can recover the likelihood contours presented in iKessler et al.l ([2009a) 
using a simple implementation of SMC ABC. We demonstrate that we can recover the posterior 
distribution derived from current analysis techniques with the hope of convincing the reader this 
approach will be useful in the near future. We do note that the A in ABC stands for "Approximate." 
One should expect slight differences in the estimated posterior distributions due to choices of 
distance metric, summary statistics, and final tolerance. 
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4.4. Type IIP Contamination 

We add 34 simulated Type IIP supernovae to the SDSS sample so that the overall type contam- 
ination is 25%. While the amount and type are a bit extreme it is useful for illustrative purposes. 
We use SNANA to simulate the data which uses spectral templates and smoothed light curves of 
well observed supernovae. We use the "NONIa" option which computes the observer magnitudes 
from the spectral energy distribution and we set MAGOFF=-0.6 and MAGSMEAR=0.9. For de- 
tails on these keywords and additional information on simulating non-la light curves we refer the 
reader to Section 3.5 of the SNANA manually The selection cuts, other observing parameters, and 
fitting procedure remain as described in Section 14.11 Our new sample is plotted in Figure [TU1 

We modify our SMC ABC analysis as follows; after drawing cosmology parameters from n(9), 
we simulate and fit additional Type IIP light curves in the aforementioned manner and add those 
to our simulated Type I data. From this point the SMC ABC algorithm proceeds as before. Our 
new final tolerance has increased to 0.038 due to the additional scatter in the Hubble diagram. 

The resulting 95% credible region is plotted in Fig. [11] as the blue-solid line along with the 
95% confidence regions from \ 2 with (red-dashed) and without (black-dotted) type contamination. 
The contours from the \ 2 analysis have shifted due to the type contamination. One can attempt to 
fix this bias with simulations about the best fit value but one can use SMC ABC to reproduce the 
full bias-correct contours. The ABC contours are 42% larger in area than the x 2 uncontaminated 
contours, but cover essentially the same area as the original ABC contours from the uncontaminated 
sample. If contamination is properly modeled the ABC method is robust against these effects that 
can only be applied on a population basis rather than as a per-object correction. 

It is worthwhile to note that while the division between statistical and systematic errors is 
often loosely used to make a distinction between uncertainties that will decrease with more data of 
the same form versus uncertainties that will not decrease with larger sample sizes, the benefit of a 
forward-modeling framework is that they can be treated consistently and simultaneously. To create 
a simulation model one is forced to make choices regarding the distributions of all statistical and 
systematic uncertainties through either analytic or empirical methods. Systematic errors come in 
at least three flavors: (1) effects that we know and understand and have a reasonable understanding 
of the relevant input distribution; (2) effects we qualitatively understand, but for which we do not 
have a good input prior distribution: e.g., Ry values in host galaxies. We can compute the effect 
on a supernova lightcurve, but we are relatively uncertain about the correct distribution of Ry in 
galaxies in the Universe; (3) effects that we lose sleep over but that we have so little understanding 
of that we cannot model their effects at all, although we may have some purely empirical guidance: 
spectroscopic selection biases; evolving metallicity content of stars over the last 8 billion years. 
Systematic errors of type 1 are easy to include in ABC. One can use ABC to examine the effects on 
the posterior distribution from different choices of distributions for systematic errors of type 2. One 



^http : //sdssdp62 . f nal . gov/sdsssn/SNANA-PUBLIC/doc/snana_manual . pdf 
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may be able to include empirical distributions for systematic errors of type 3. Otherwise ABC can 
not tell you something about these systematic errors unless they are treated as model parameters. 
Forward modeling with an SMC ABC approach provides a powerful way to fully incorporate all 
available knowledge and ignorance. 

5. Future Work 

We presented here a proof of concept for an SMC ABC method to infer model parameters 
based on SN la measurements. To fully deploy this method will require an incorporation of all 
data sets and modeling relative systematics between the surveys, e.g., relative calibration. This is 
tractable, if somewhat tedious, and has been done with varying degrees of completeness already 
in the literature. Extending this approach to explorations of time-variable dark energy is a simple 
matter of implementing at different generating model for luminosity distance as a function of 
redshift. 

For future photometric-focused surveys, we would explore more fully the non-Gaussianity of 
photometric redshifts as derived from calibration samples. The probability distributions for these 
photometric redshifts will be strongly affected by evolution of the contamination fraction of non- 
SN la with redshift. Once that is phrased as part of the generating model, ABC will incorporate 
such uncertainties on the same basis as all of the other cosmological and astrophysical parameters. 

The ABC+SNANA framework is a very suitable vehicle for testing the effects of different 
lightcurve fitters on the derived cosmological parameters. ABC will help efficiently determine what 
different parameter choices in the fitters should be explored. 

But the real long-term goal would be to apply the summary statistic comparison at the indi- 
vidual lightcurve level. This could significantly reduce the computing time. The analysis presented 
in this work with ~100 supernovae and 1200 particles required ~600 CPU-hours. We estimate that 
a realistic problem with a sample of 10 4 supernovae could be done on O(10) CPU-years, which is 
within reasonable computing resources. Applying the summary statistic comparison at the indi- 
vidual light curve level rather than in Hubble diagram space bypasses fitting the simulated light 
curves which currently requires most (~90%) of the computing time. 

Comparing the simulated and observed data at the individual lightcurve level would also be 
the cleanest framework to explore agreement and evolution of systematics. The only "training" 
would be in the generation of the templates that the SN la are derived from in the first place. The 
cosmological distance and supernova property comparison would be finally integrated in one direct 
comparison. 



6. Conclusions 



We have introduced and demonstrated the use of Approximate Bayesian Computation tech- 
niques to address the requirements for analyzing near-future SN la cosmological data sets. ABC 
presents a consistent and efficient approach to explore multi-dimensional non-Gaussian parameter 
distributions with full incorporation of systematic uncertainties. 



• Forward modeling is often the only way to correctly incorporate the full range of statistical 
and systematic uncertainties in some of the big astronomy questions being addressed today. 

• Calculation of likelihood functions for evaluation in a traditional Markov Chain Monte Carlo 
approach may not be analytically tractable. 

• ABC allows for a simultaneous exploration of parameter space and tolerance to create credible 
regions for physical parameters of interest without the need to construct an explicit likelihood 
function. 

• Sequential Monte Carlo ABC offers an efficient way to explore the full parameter space of all 
important input parameters and model effects. 

• The use of a summary statistic focuses attention directly on the ability to discriminate model 
parameter values in the relevant space of observed values. 



We encourage scientists facing similar problems to consider the use of ABC techniques to 
increase their incisive power to explore the complicated parameter spaces that are surrounding the 
key questions in astrophysics and cosmology today. 
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A. Non-Parametrically Smoothing the Simulated and Observed Data 



To perform a non-parametric smooth we use a robust locally weighted regression (loess) ([Cleveland 



19791 ) . This routine smooths the data by iteratively fitting a local d-order polynomial to the data 



using a tricube weighting function. We use a quadratic polynomial and, for the observed data, add 
an additional weight according to the uncertainty in fi given by Eq. EU 
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We choose the size of the window to locally smooth over by minimizing the risk or the sum of 
the variance and bias squared. We estimate the risk using the leave-one-out cross validation score 



R(h) 



(Al) 



where f(x) is the smoothed function using a smo othing window give n by h and f(— Xi ) is the smooth 



obtained leaving out the i th data point (see, e.g.. IWassermanl (|2006l )). The smoothing window goes 
from zero to one with zero being no smooth and one resulting in a line. Using the SDSS data we 
find the minimum risk to yield a smoothing window of 0.52. As estimating the risk is somewhat 
computationally intensive, we determine the smoothing window using the observed data and use 
the same window to smooth the simulated data in the ABC algorithm. 



REFERENCES 

Amanullah, R., Lidman, C, Rubin, D., Aldering, C, Astier, P., Barbary, K., Burns, M. S., Conley, 
A., Dawson, K. S., Deustua, S. E., Doi, M., Fabbro, S., Faccioli, L., Fakhouri, H. K., Folatelli, 
C, Fruchter, A. S., Furusawa, H., Garavini, C, Goldhaber, G., Goobar, A., Groom, D. E., 
Hook, I., Howell, D. A., Kashikawa, N., Kim, A. G., Knop, R. A., Kowalski, M., Linder, 
E., Meyers, J., Morokuma, T., Nobili, S., Nordin, J., Nugent, P. E., Ostman, L., Pain, R., 
Panagia, N., Perlmutter, S., Raux, J., Ruiz-Lapuente, P., Spadafora, A. L., Strovink, M., 
Suzuki, N., Wang, L., Wood-Vasey, W. M., Yasuda, N., & Supernova Cosmology Project, 
T. 2010, ApJ, 716, 712 

Astier, P., Guy, J., Regnault, N., Pain, R., Aubourg, E., Balam, D., Basa, S., Carlberg, R. G., 
Fabbro, S., Fouchez, D., Hook, I. M., Howell, D. A., Lafoux, H., Neill, J. D., Palanque- 
Delabrouille, N., Perrett, K., Pritchet, C. J., Rich, J., Sullivan, M., Taillet, R., Aldering, 
G., Antilogus, P., Arsenijevic, V., Balland, C, Baumont, S., Bronder, J., Courtois, H., Ellis, 
R. S., Filiol, M., Gongalves, A. C, Goobar, A., Guide, D., Hardin, D., Lusset, V., Lidman, 
C, McMahon, R., Mouchet, M., Mourao, A., Perlmutter, S., Ripoche, P., Tao, C, & Walton, 
N. 2006, A&A, 447, 31 

Barlow, R. 2003, ArXiv Physics e-prints 

Barnes, C, Filippi, S., Stumpf, M. P. H., k Thorne, T. 2011, ArXiv e-prints, 1106.6281 
Beaumont, M. A., Cornuet, J.-M., Marin, J.-M., & Robert, C. P. 2009, Biometrika, 96, 983 
Beaumont, M. A., Zhang, W., & Balding, D. J. 2002, Genetics, 162, 2025 
Blum, M. G. B. & Francois, O. 2010, Statistics and Computing, 20, 63 



Blum, M. G. B., Nunes, M. A., Prangle, D., & Sisson, S. A. 2012, ArXiv e-prints, 1202.3819 



-26- 



Cameron, E. & Pettitt, A. N. 2012, ArXiv e-prints, 1202.1426 
Cardelli, J. A., Clayton, G. C, k Mathis, J. S. 1989, ApJ, 345, 245 

Cleveland, W. S. 1979, Journal of the American Statistical Association, 74, 829 linklabel 

Conley, A., Guy, J., Sullivan, M., Regnault, N., Astier, P., Balland, C, Basa, S., Carlberg, R. G., 
Fouchez, D., Hardin, D., Hook, I. M., Howell, D. A., Pain, R., Palanque-Delabrouille, N., 
Perrett, K. M., Pritchet, C. J., Rich, J., Ruhlmann-Kleider, V., Balam, D., Baumont, S., 
Ellis, R. S., Fabbro, S., Fakhouri, H. K., Fourmanoit, N., Gonzalez-Gaitan, S., Graham, 
M. L., Hudson, M. J., Hsiao, E., Kronborg, T., Lidman, C, Mourao, A. M., Neill, J. D., 
Perlmutter, S., Ripoche, P., Suzuki, N., & Walker, E. S. 2011, ApJS, 192, 1 

Contreras, C, Hamuy, M., Phillips, M. M., Folatelli, G., Suntzeff, N. B., Persson, S. E., Stritzinger, 
M., Boldt, L., Gonzalez, S., Krzeminski, W., Morrell, N., Roth, M., Salgado, F., Jose 
Maureira, M., Burns, C. R., Freedman, W. L., Madore, B. F., Murphy, D., Wyatt, P., Li, 
W., & Filippenko, A. V. 2010, AJ, 139, 519 

Cowles, M. K. & Carlin, B. P. 1996, Journal of the American Statistical Association, 91, pp. 883 

|linkJ 

Dilday, B., Kessler, R., Frieman, J. A., Holtzman, J., Marriner, J., Miknaitis, G., Nichol, R. C, 
Romani, R., Sako, M., Bassett, B., Becker, A., Cinabro, D., DeJongh, F., Depoy, D. L., 
Doi, M., Garnavich, P. M., Hogan, C. J., Jha, S., Konishi, K., Lampeitl, H., Marshall, 
J. L., McGinnis, D., Prieto, J. L., Riess, A. G., Richmond, M. W., Schneider, D. P., Smith, 
M., Takanashi, N., Tokita, K., van der Heyden, K., Yasuda, N., Zheng, C, Barentine, J., 
Brewington, H., Choi, C, Crotts, A., Dembicky, J., Harvanek, M., Im, M., Ketzeback, 
W., Kleinman, S. J., Krzesihski, J., Long, D. C, Malanushenko, E., Malanushenko, V., 
McMillan, R. J., Nitta, A., Pan, K., Saurage, G., Snedden, S. A., Watters, S., Wheeler, 
J. C, & York, D. 2008, ApJ, 682, 262 

Eisenstein, D. J., Zehavi, I., Hogg, D. W., Scoccimarro, R., Blanton, M. R., Nichol, R. C, Scranton, 
R., Seo, H.-J., Tegmark, M., Zheng, Z., Anderson, S. F., Annis, J., Bahcall, N., Brinkmann, 
J., Buries, S., Castander, F. J., Connolly, A., Csabai, I., Doi, M., Fukugita, M., Frieman, 
J. A., Glazebrook, K., Gunn, J. E., Hendry, J. S., Hennessy, G., Ivezic, Z., Kent, S., Knapp, 
G. R., Lin, H., Loh, Y.-S., Lupton, R. H., Margon, B., McKay, T. A., Meiksin, A., Munn, 
J. A., Pope, A., Richmond, M. W., Schlegel, D., Schneider, D. P., Shimasaku, K., Stoughton, 
C, Strauss, M. A., SubbaRao, M., Szalay, A. S., Szapudi, I., Tucker, D. L., Yanny, B., & 
York, D. G. 2005, ApJ, 633, 560 

Fearnhead, P. & Prangle, D. 2012, Journal Of The Royal Statistical Society Series B, 74 

Filippi, S., Barnes, C, & Stumpf, M. P. H. 2011, ArXiv e-prints, 1106.6280 



-27- 



Frieman, J. A., Bassett, B., Becker, A., Choi, C, Cinabro, D., DeJongh, F., Depoy, D. L., Dilday, 

B. , Doi, M., Garnavich, P. M., Hogan, C. J., Holtzman, J., Im, M., Jha, S., Kessler, R., 
Konishi, K., Lampeitl, H., Marriner, J., Marshall, J. L., McGinnis, D., Miknaitis, G., Nichol, 
R. C, Prieto, J. L., Riess, A. G., Richmond, M. W., Romani, R., Sako, M., Schneider, 
D. P., Smith, M., Takanashi, N., Tokita, K., van der Heyden, K., Yasuda, N., Zheng, C, 
Adelman-McCarthy, J., Annis, J., Assef, R. J., Barentine, J., Bender, R., Blandford, R. D., 
Boroski, W. N., Bremer, M., Brewington, H., Collins, C. A., Crotts, A., Dembicky, J., 
Eastman, J., Edge, A., Edmondson, E., Elson, E., Eyler, M. E., Filippenko, A. V., Foley, 
R. J., Frank, S., Goobar, A., Gueth, T., Gunn, J. E., Harvanek, M., Hopp, U., Ihara, Y., 
Ivezic, Z., Kahn, S., Kaplan, J., Kent, S., Ketzeback, W., Kleinman, S. J., Kollatschny, W., 
Kron, R. G., Krzesihski, J., Lamenti, D., Leloudas, G., Lin, H., Long, D. C, Lucey, J., 
Lupton, R. H., Malanushenko, E., Malanushenko, V., McMillan, R. J., Mendez, J., Morgan, 

C. W., Morokuma, T., Nitta, A., Ostman, L., Pan, K., Rockosi, C. M., Romer, A. K., 
Ruiz-Lapuente, P., Saurage, G., Schlesinger, K., Snedden, S. A., Sollerman, J., Stoughton, 
C, Stritzinger, M., Subba Rao, M., Tucker, D., Vaisanen, P., Watson, L. C, Watters, S., 
Wheeler, J. C, Yanny, B., & York, D. 2008, AJ, 135, 338 

Fu, Y. X. & Li, W. H. 1997, Molecular Biology and Evolution, 14, 195 

Ganeshalingam, M., Li, W., Filippenko, A. V., Anderson, C, Foster, G., Gates, E. L., Griffith, 
C. V., Grigsby, B. J., Joubert, N., Leja, J., Lowe, T. B., Macomber, B., Pritchard, T., 
Thrasher, P., & Winslow, D. 2010, ApJS, 190, 418 

Hicken, M., Challis, P., Kirshner, R. P., Rest, A., Cramer, C. E., Wood-Vasey, W. M., Bakos, G., 
Berlind, P., Brown, W. R., Caldwell, N., Calkins, M., Currie, T., de Kleer, K., Esquerdo, G., 
Everett, M., Falco, E., Fernandez, J., Friedman, A. S., Groner, T., Hartman, J., Holman, 
M. J., Hutchins, R., Keys, S., Kipping, D., Latham, D., Marion, G. H., Narayan, G., Pahre, 
M., Pal, A., Peters, W., Perumpilly, G., Ripman, B., Sipocz, B., Szentgyorgyi, A., Tang, S., 
Torres, M. A. P., Vaz, A., Wolk, S., & Zezas, A. 2012, ApJS, 200, 12 

Hicken, M., Wood-Vasey, W. M., Blondin, S., Challis, P., Jha, S., Kelly, P. L., Rest, A., & Kirshner, 
R. P. 2009, ApJ, 700, 1097 

Holtzman, J. A., Marriner, J., Kessler, R., Sako, M., Dilday, B., Frieman, J. A., Schneider, D. P., 
Bassett, B., Becker, A., Cinabro, D., DeJongh, F., Depoy, D. L., Doi, M., Garnavich, P. M., 
Hogan, C. J., Jha, S., Konishi, K., Lampeitl, H., Marshall, J. L., McGinnis, D., Miknaitis, 
G., Nichol, R. C, Prieto, J. L., Riess, A. G., Richmond, M. W., Romani, R., Smith, M., 
Takanashi, N., Tokita, K., van der Heyden, K., Yasuda, N., & Zheng, C. 2008, AJ, 136, 2306 

Hsiao, E. Y., Conley, A., Howell, D. A., Sullivan, M., Pritchet, C. J., Carlberg, R. G., Nugent, 
P. E., & Phillips, M. M. 2007, ApJ, 663, 1187 



Jha, S., Riess, A. G., & Kirshner, R. P. 2007, ApJ, 659, 122 



-28- 



Kessler, R., Becker, A. C, Cinabro, D., Vanderplas, J., Frieman, J. A., Marriner, J., Davis, T. M., 
Dilday, B., Holtzman, J., Jha, S. W., Lampeitl, H., Sako, M., Smith, M., Zheng, C, Nichol, 
R. C, Bassett, B., Bender, R., Depoy, D. L., Doi, M., Elson, E., Filippenko, A. V., Foley, 
R. J., Garnavich, P. M., Hopp, U., Ihara, Y., Ketzeback, W., Kollatschny, W., Konishi, 
K., Marshall, J. L., McMillan, R. J., Miknaitis, G., Morokuma, T., Mortsell, E., Pan, K., 
Prieto, J. L., Richmond, M. W., Riess, A. G., Romani, R., Schneider, D. P., Sollerman, 
J., Takanashi, N., Tokita, K., van der Heyden, K., Wheeler, J. C, Yasuda, N., & York, D. 
2009a, ApJS, 185, 32 

Kessler, R., Bernstein, J. P., Cinabro, D., Dilday, B., Frieman, J. A., Jha, S., Kuhlmann, S., 
Miknaitis, G., Sako, M., Taylor, M., & Vanderplas, J. 2009b, PASP, 121, 1028 

Knop, R. A., Aldering, G., Amanullah, R., Astier, P., Blanc, G., Burns, M. S., Conley, A., Deustua, 
S. E., Doi, M., Ellis, R., Fabbro, S., Folatelli, G., Fruchter, A. S., Garavini, G., Garmond, 
S., Garton, K., Gibbons, R., Goldhaber, G., Goobar, A., Groom, D. E., Hardin, D., Hook, 
I., Howell, D. A., Kim, A. G., Lee, B. C, Lidman, C., Mendez, J., Nobili, S., Nugent, P. E., 
Pain, R., Panagia, N., Pennypacker, C. R., Perlmutter, S., Quimby, R., Raux, J., Regnault, 
N., Ruiz-Lapuente, P., Sainton, G., Schaefer, B., Schahmaneche, K., Smith, E., Spadafora, 
A. L., Stanishev, V., Sullivan, M., Walton, N. A., Wang, L., Wood-Vasey, W. M., & Yasuda, 
N. 2003, ApJ, 598, 102 

Komatsu, E., Dunkley, J., Nolta, M. R., Bennett, C. L., Gold, B., Hinshaw, G., Jarosik, N., Larson, 
D., Limon, M., Page, L., Spergel, D. N., Halpern, M., Hill, R. S., Kogut, A., Meyer, S. S., 
Tucker, G. S., Weiland, J. L., Wollack, E., & Wright, E. L. 2009, ApJS, 180, 330 

Kowalski, M., Rubin, D., Aldering, G., Agostinho, R. J., Amadon, A., Amanullah, R., Balland, C, 
Barbary, K., Blanc, G., Challis, P. J., Conley, A., Connolly, N. V., Covarrubias, R., Dawson, 
K. S., Deustua, S. E., Ellis, R., Fabbro, S., Fadeyev, V., Fan, X., Farris, B., Folatelli, G., 
Frye, B. L., Garavini, G., Gates, E. L., Germany, L., Goldhaber, G., Goldman, B., Goobar, 
A., Groom, D. E., Haissinski, J., Hardin, D., Hook, I., Kent, S., Kim, A. G., Knop, R. A., 
Lidman, C, Linder, E. V., Mendez, J., Meyers, J., Miller, G. J., Moniez, M., Mourao, A. M., 
Newberg, H., Nobili, S., Nugent, P. E., Pain, R., Perdereau, O., Perlmutter, S., Phillips, 
M. M., Prasad, V., Quimby, R., Regnault, N., Rich, J., Rubenstein, E. P., Ruiz-Lapuente, P., 
Santos, F. D., Schaefer, B. E., Schommer, R. A., Smith, R. C, Soderberg, A. M., Spadafora, 
A. L., Strolger, L.-C, Strovink, M., Suntzeff, N. B., Suzuki, N., Thomas, R. C, Walton, 
N. A., Wang, L., Wood-Vasey, W. M., & Yun, J. L. 2008, ApJ, 686, 749 

Lampeitl, H., Nichol, R. C, Seo, H.-J., Giannantonio, T., Shapiro, C, Bassett, B., Percival, W. J., 
Davis, T. M., Dilday, B., Frieman, J., Garnavich, P., Sako, M., Smith, M., Sollerman, J., 
Becker, A. C, Cinabro, D., Filippenko, A. V., Foley, R. J., Hogan, C. J., Holtzman, J. A., 
Jha, S. W., Konishi, K., Marriner, J., Richmond, M. W., Riess, A. G., Schneider, D. P., 
Stritzinger, M., van der Heyden, K. J., Vanderplas, J. T., Wheeler, J. C, & Zheng, C. 2010, 
MNRAS, 401, 2331 



-29- 



Law, N. M., Kulkarni, S. R., Dekany, R. G., Ofek, E. 0., Quimby, R. M., Nugent, P. E., Surace, 
J., Grillmair, C. C, Bloom, J. S., Kasliwal, M. M., Bildsten, L., Brown, T., Cenko, S. B., 
Ciardi, D., Croner, E., Djorgovski, S. G., van Eyken, J., Filippenko, A. V., Fox, D. B., Gal- 
Yam, A., Hale, D., Hamam, N., Helou, G., Henning, J., Howell, D. A., Jacobsen, J., Laher, 
R., Mattingly, S., McKenna, D., Pickles, A., Poznanski, D., Rahmer, G., Rau, A., Rosing, 
W., Shara, M., Smith, R., Starr, D., Sullivan, M., Velur, V., Walters, R., & Zolkower, J. 
2009, PASP, 121, 1395 

LSST Science Collaborations, Abell, P. A., Allison, J., Anderson, S. F., Andrew, J. R., Angel, 
J. R. P., Armus, L., Arnett, D., Asztalos, S. J., Axelrod, T. S., & et al. 2009, ArXiv e-prints, 
0912.0201 

Mandel, K. S., Narayan, G., & Kirshner, R. P. 2011, ApJ, 731, 120 

Marin, J.-M., Pudlo, P., Robert, C. P., & Ryder, R. 2011, ArXiv e-prints, 1101.0955 

Marjoram, P., Molitor, J., Plagnol, V., &: Tavare, S. 2003, Proceedings of the National Academy of 
Science, 1001, 15324 

Miknaitis, G., Pignata, G., Rest, A., Wood-Vasey, W. M., Blondin, S., Challis, P., Smith, R. C, 
Stubbs, C. W., Suntzeff, N. B., Foley, R. J., Matheson, T., Tonry, J. L., Aguilera, C, 
Blackman, J. W., Becker, A. C, Clocchiatti, A., Covarrubias, R., Davis, T. M., Filippenko, 
A. V., Garg, A., Garnavich, P. M., Hicken, M., Jha, S., Krisciunas, K., Kirshner, R. P., 
Leibundgut, B., Li, W., Miceli, A., Narayan, G., Prieto, J. L., Riess, A. G., Salvo, M. E., 
Schmidt, B. P., Sollerman, J., Spyromilio, J., & Zenteno, A. 2007, ApJ, 666, 674 

Moral, P. D., Doucet, A., &: Jasra, A. 2006, Journal Of The Royal Statistical Society Series B, 68, 
411 jj [LINK] | 

Perlmutter, S., Aldering, G., Goldhaber, G., Knop, R. A., Nugent, P., Castro, P. G., Deustua, S., 
Fabbro, S., Goobar, A., Groom, D. E., Hook, I. M., Kim, A. G., Kim, M. Y., Lee, J. C, 
Nunes, N. J., Pain, R., Pennypacker, C. R., Quimby, R., Lidman, C, Ellis, R. S., Irwin, 
M., McMahon, R. G., Ruiz-Lapuente, P., Walton, N., Schaefer, B., Boyle, B. J., Filippenko, 
A. V., Matheson, T., Fruchter, A. S., Panagia, N., Newberg, H. J. M., Couch, W. J., & The 
Supernova Cosmology Project. 1999, ApJ, 517, 565 

Pritchard, J. K., Seielstad, M. T., Perez-Lezaun, A., & Feldman, M. W. 1999, Molecular Biology 
and Evolution, 16, 1791 

Riess, A. G., Filippenko, A. V., Challis, P., Clocchiatti, A., Diercks, A., Garnavich, P. M., Gilliland, 
R. L., Hogan, C. J., Jha, S., Kirshner, R. P., Leibundgut, B., Phillips, M. M., Reiss, D., 
Schmidt, B. P., Schommer, R. A., Smith, R. C, Spyromilio, J., Stubbs, C, Suntzeff, N. B., 
& Tonry, J. 1998, AJ, 116, 1009 



- 30 - 



Riess, A. G., Strolger, L.-G., Tonry, J., Casertano, S., Ferguson, H. C, Mobasher, B., Challis, P., 
Filippenko, A. V., Jha, S., Li, W., Chornock, R., Kirshner, R. P., Leibundgut, B., Dickinson, 
M., Livio, M., Giavalisco, M., Steidel, C. C, Bem'tez, T., & Tsvetanov, Z. 2004, ApJ, 607, 
665 

Sako, M., Bassett, B., Becker, A., Cinabro, D., DeJongh, F., Depoy, D. L., Dilday, B., Doi, M., 
Frieman, J. A., Garnavich, P. M., Hogan, C. J., Holtzman, J., Jha, S., Kessler, R., Konishi, 
K., Lampeitl, H., Marriner, J., Miknaitis, G., Nichol, R. C, Prieto, J. L., Riess, A. G., 
Richmond, M. W., Romani, R., Schneider, D. P., Smith, M., Subba Rao, M., Takanashi, 
N., Tokita, K., van der Heyden, K., Yasuda, N., Zheng, C, Barentine, J., Brewington, 
H., Choi, C, Dembicky, J., Harnavek, M., Ihara, Y., Im, M., Ketzeback, W., Kleinman, 
S. J., Krzesihski, J., Long, D. C, Malanushenko, E., Malanushenko, V., McMillan, R. J., 
Morokuma, T., Nitta, A., Pan, K., Saurage, G., & Snedden, S. A. 2008, AJ, 135, 348 

Schlegel, D. J., Finkbeiner, D. P., & Davis, M. 1998, ApJ, 500, 525 

Sisson, S. A., Fan, Y., & Tanaka, M. M. 2007, Proceedings of the National Academy of Science, 
104, 1760 

Stritzinger, M. D., Phillips, M. M., Boldt, L. N., Burns, C, Campillay, A., Contreras, C, Gonzalez, 
S., Folatelli, G., Morrell, N., Krzeminski, W., Roth, M., Salgado, F., DePoy, D. L., Hamuy, 
M., Freedman, W. L., Madore, B. F., Marshall, J. L., Persson, S. E., Rheault, J. -P., Suntzeff, 
N. B., Villanueva, S., Li, W., & Filippenko, A. V. 2011, AJ, 142, 156 

Tavare, S. A., Balding, D. J., Griffiths, R. C, & Donnelly, P. 1997, Genetics, 145, 505 

Wasserman, L. 2006, All of Nonparametric Statistics (233 Spring Street, New York, NY 10013, USA: 
Springer Science+Businesse Media, Inc.) 



Weiss, G. & von Haeseler, A. 1998, Genetics, 149, 1539 [LINK 



Wood-Vasey, W. M., Miknaitis, G., Stubbs, C. W., Jha, S., Riess, A. G., Garnavich, P. M., 
Kirshner, R. P., Aguilera, C, Becker, A. C, Blackman, J. W., Blondin, S., Challis, P., 
Clocchiatti, A., Conley, A., Covarrubias, R., Davis, T. M., Filippenko, A. V., Foley, R. J., 
Garg, A., Hicken, M., Krisciunas, K., Leibundgut, B., Li, W., Matheson, T., Miceli, A., 
Narayan, G., Pignata, G., Prieto, J. L., Rest, A., Salvo, M. E., Schmidt, B. P., Smith, R. C, 
Sollerman, J., Spyromilio, J., Tonry, J. L., Suntzeff, N. B., & Zenteno, A. 2007, ApJ, 666, 
694 



This preprint was prepared with the AAS IATjjX macros v5.2. 



- 31 - 




Spec z 

Fig. 1. — Photometric vs. spectroscopic redshift for 1744 simulated SNe la using SNANA and 
smoothed with a Gaussian kernel. Note the complex structure and asymmetry about the one-to- 
one line indicating departures from Gaussianity. This sample is used to represent a realistic joint 
distribution between the spectroscopic and photometric redshifts. 
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Fig. 2. — Comparison between the assumed Gaussian joint distribution between z and Ci (dashed) 
and nonparametric fits (solid) through the simulated data shown in Fig. [TJ Three cross-sections 
are shown, one at each of photometric redshifts of 0.1, 0.2, and 0.3. In each case, a bin of width 
0.02 is constructed, centered on these values, and the observations which fall into this bin are used 
to estimate the distribution for spectroscopic redshift. A Gaussian is not a terrible approximation 
to these cross-sections, but is it far from ideal. 
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Fig. 3. — Distance modulus residual defined as fi(Ci, 6)—fi(zi, 6) as a function of photometric redshift 
Z{. Under the described Gaussian approximation, these distributions should all have mean zero. The 
boxplots compare the distribution in different narrow redshift bins. The top and bottom of the box 
indicate the 25th and 75th percentile, the center line marks the median, and the "whiskers" mark 
1.5 times the inter-quartile range. Points outside the whiskers are considered outliers. The "notch" 
in each boxplot allows for comparison to determine statistical significance: if the notches of two 
boxes do not overlap, then there is a statistically significant difference between the medians of the 
populations. Hence, it is evident that there is a bias introduced; the centers of these distributions 
are not always zero. This bias indicates that the Gaussian model for the joint distribution of (£, z) 
is inappropriate. 
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Fig. 4. — Comparison between the 95% credible regions for a simulated set of supernova formed by 
taking two approaches: (1) where the true redshift is known (black-solid line) and (2) where the 
approximation described in Section 2.1 is utilized (blue-dashed line). The star is at the true value 
of the parameters used in the simulation. The increased width of the confidence region is natural, 
given the use of photometric redshifts instead of spectroscopic redshifts, but the bias is a result of 
the inadequacy of the assumed Gaussian model. 
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Fig. 5. — Illustration of key steps of the SMC ABC algorithm in the example. Panel (a): a 
collection of 500 particles plotted in the relevant parameter space from an intermediate iteration of 
the SMC ABC algorithm. A random particle is selected, plotted as the star, and perturbed a small 
amount. Panel (b): the simulated dataset corresponding to the perturbed particle from panel a. 
The line is a nonparametric smooth of the data and represents the summary statistic. Panel (c): 
"Observed" data. The dashed line represents a nonparametric smooth of the observed data. Panel 
(d): a comparison between the simulated and observed datasets via the sum of squared deviations 
across the length of the curve. The particle is accepted in this iteration even though the curves are 
discrepant at high redshift as the tolerance is not small enough to reject it. Such a point would 
likely be rejected in a future iteration as the tolerance is decreased (see Fig. [6]). 
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Fig. 6. — Progress of the ABC SMC algorithm in estimating the posterior distribution for the toy 
example. As e decreases, the collection of particles converges to a sample from the posterior (when 
the weights are taken into account). The solid contour is the 95% credible region that would have 
been formed by someone who had knowledge of the spectroscopic redshifts. The dashed contours 
result from fitting to the output of the ABC algorithm. Compare with Fig. [4] to note the reduction 
of the bias that resulted from the Gaussian approximation. Note that it is not expected that these 
contours will be the same, as the ABC simulations are built upon data using photometric redshifts; 
hence, there is additional uncertainty in the parameter estimates. 
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Fig. 7. — Hubble diagram for the observed data in blue and a simulated dataset in red. The 
simulated dataset is offset from the observed data by 1 mag and was generated assuming $7 a/ = 0.3 
and w = —1.0. The distance modulus uncertainties, intrinsic scatter, and redshift distributions are 
well reproduced in the simulated dataset. Simulated datasets like this one with different cosmologies 
are used in our ABC analysis. 
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Fig. 8. — Illustration of the distance metric using SDSS data and simulated data from SNANA. 
Top: Hubble diagram for the observed data in blue and a simulated dataset in red. The simulated 
data were generated assuming O^f = 0.1 and w = —2.0. Bottom: Nonparametric smooth of the 
two datasets. The distance metric is defined to be the median absolute deviation between the 
smoothed curves which is equal to 0.402 for this case. Our final tolerance is 0.033. 
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Fig. 9. — Comparison of SMC ABC with a x 2 analysis. Top: Particles from the final ABC 
iteration. Bottom: The 95% credible regions from ABC (blue-solid) and x 2 (red-dashed). The 
contours between ABC and x 2 are wen matched except near the boundaries. The discrepancy 
results from the sharp boundaries of our prior. ABC is attempting to account for the fact that 
there is relevant parameter space which it cannot explore. 
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Fig. 10. — SDSS sample plus 34 Type IIP supernovae simulated with SNANA. This combined 
dataset is our "observed" sample for the type contamination analysis. 




Fig. 11. — 95% credible region from ABC (blue-solid) and the 95% confidence interval from x 2 
for the SDSS sample with type contamination (red-dashed) and the original SDSS sample (black- 
dotted). The type contamination biases the x 2 result. ABC reproduces the entire credible region 
without this bias and reflects additional uncertainty due to increased scatter in the Hubble diagram. 



